External & Intrinsic Plagiarism Detection: VSM & Discourse Markers based Approach - Notebook for PAN at CLEF 2011

نویسندگان

  • Sameer Rao
  • Parth Gupta
  • Khushboo Singhal
  • Prasenjit Majumder
چکیده

This paper aims to explain the performance of plagiarism detection system which can detect External as well as Intrinsic Plagiarism in text. It reports the results on PAN-PC-2011 test corpus. We investigated Vector Space Model based techniques for detecting external plagiarism cases and discourse markers based features to detect intrinsic plagiarism cases.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Approaches for Intrinsic and External Plagiarism Detection - Notebook for PAN at CLEF 2011

Plagiarism detection has been considered as a classification problem which can be approximated with intrinsic strategies, considering self-based information from a given document, and external strategies, considering comparison techniques between a suspicious document and different sources. In this work, both intrinsic and external approaches for plagiarism detection are presented. First, the m...

متن کامل

Using WordNet-based Semantic Similarity Measurement in External Plagiarism Detection - Notebook for PAN at CLEF 2011

Continuing our previous work started at PAN 2009 and PAN 2010 [7] we considered further research options based on the achieved baseline of the best performing algorithms. The research done by Potthast et al. [4] presented a sliced view of the presented approaches showing their performance on specific corpus metrics external\intrinsic, obfuscation strategies (none, artificial high\low, simulated...

متن کامل

External Plagiarism Detection using Information Retrieval and Sequence Alignment - Notebook for PAN at CLEF 2011

This paper describes the University of Sheffield entry for the 3rd International Competition on Plagiarism Detection which attempted the monolingual external plagiarism detection task. A three stage framework was used: preprocessing and indexing, candidate document selection (using an Information Retrieval based approach) and detailed analysis (using the Running Karp-Rabin Greedy String Tiling ...

متن کامل

Overview of the 3rd International Competition on Plagiarism Detection

This paper overviews eleven plagiarism detectors that have been developed and evaluated within PAN’11. We survey the detection approaches developed for the two sub-tasks “external plagiarism detection” and “intrinsic plagiarism detection,” and we report on their detailed evaluation based on the third revised edition of the PAN plagiarism corpus PAN-PC-11.

متن کامل

The Encoplot Similarity Measure for Automatic Detection of Plagiarism - Notebook for PAN at CLEF 2011

This paper describes the evolution of our method Encoplot for automatic plagiarism detection and the results of the participation to the PAN’11 competition. The main novelties are the introduction of a new similarity measure and of a new ranking method, which cooperate to rank much better the source– suspicious document pairs when selecting the candidates for the detailed analysis phase. We hav...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011